B73 RefGen_v3 genome assembly

The Zea mays ssp mays cv B73 Reference Genome

Report a gene model error

B73 Representative Reference Genome Assembly Status
B73 Representative Reference Genome Assembly Details
Change History
B73 Representative Reference Gene Models and Nomenclature
Gramene Versions
Genome Assembly and Gene Model Issues
B73 Stock Information
Chromosome names - Genbank accessions
Downloads
History of Maize Genome Assemblies and Annotations
The Nomenclature Standards
Publications
FAQs
Coming Soon
Historic information


Download v5 assembly sequence

Download v5 gene model FASTA

Download v5 gene model GFF

See v5 Data at GenBank

Download other versions

The Maize B73 Representative Reference Genome

The maize B73 reference genome has been revised four times since its initial release as a BAC-by-BAC assembly in 2009. As of 2016, the maize nomenclature committee has adopted naming standards to accommodate multiple Zea species, multiple accessions, and multiple versions. This recommendation is available here. The B73 reference assemblies have been known by these names:
assembly | assembly aka | annotation | annotation aka | Gramene/EnsemblPlant version
Zm-B73-REFERENCE-NAM-5.0 | B73 RefGen_v5 | Zm00001eb.1 | |
Zm-B73-REFERENCE-GRAMENE-4.0 | B73 RefGen_v4, AGPv4 | Zm00001d.2 | AGPv4 | 32/50 - 40/58
B73 RefGen_v3 | B73 RefGen_v3, AGPv3 | 5b+ | AGPv3 | 18/36 - 31/49
B73 RefGen_v2 | B73 RefGen_v2, AGPv2 | 5b | 5b.60, AGPv2 | 7/25 - 17/37
B73 RefGen_v1 | B73 RefGen_v1, AGPv1 | 4a | 4a.53, AGPv1 |


B73 Representative Reference Genome Assembly Status

The current representative reference genome for Maize is B73 Zm-B73-REFERENCE-NAM-5.0 (also known as RefGen_v5).


The current B73 assembly version, Zm-B73-REFERENCE-NAM-5.0, released in January 2020, was sequenced and assembled along with a set of 25 inbreds known as the NAM founder lines by the NAM Consortium using PacBio long reads and a mate-pair strategy. Scaffolds were validated by BioNano optical mapping, and ordered and oriented using linkage and pan-genome marker data. RNA-seq data from multiple tissues were used to annotate each genome using a pipeline that includes BRAKER, Mikado, and PASA.


The first three assemblies, B73 RefGen_v1, B73 RefGen_v2, and B73 RefGen_v3, were all based on a BAC (bacterial artificial chromosome) sequencing strategy. The B73 RefGen_v4 assembly used a new approach that relied on PacBio Single Molecule Real Time (SMRT) sequencing at Cold Spring Harbor to a depth of 60X coverage, with scaffolds created with the assistance of whole genome restriction mapping (aka Optical Mapping). Error correction of PacBio sequences was facilitated by Illumina short read DNA sequencing performed at Washington University. Annotation was accomplished in the Ware laboratory at Cold Spring Harbor using the Maker pipeline (Campbell, 2014) and ~111,000 long read PacBio transcripts from six maize tissues. More complete details on the B73 RefGen_v4 assembly can be found at Gramene or by reading the paper.


See the History of Maize Genome Assemblies and Annotations for more information.




B73 Representative Reference Genome Assembly Details

The current version is Zm-B73-REFERENCE-NAM-5.0, also known as "B73 RefGen_v5".

Chromosomes
The assembly sequence includes all 10 chromosomes.
The sequence can be downloaded here, from ENA or from GenBank

Gaps
Gaps within BACs are indicated by runs of 100 N's. Gaps between contigs are indicated by runs of 1000 N's.
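
The gap convention above can be checked directly on a sequence by scanning for runs of N. The following is a minimal sketch, assuming the run lengths are exactly 100 and 1000 as stated; the function names `find_gaps` and `summarize_gaps` are hypothetical helpers, and the toy sequence is not real maize data.

```python
import re
from collections import Counter

# Gap-length convention quoted in the text (assumed to be exact run lengths).
GAP_LABELS = {100: "within-BAC", 1000: "between-contig"}


def find_gaps(sequence):
    """Return (start, length, label) for every run of N/n in a sequence.

    Start positions are 0-based. Runs whose length matches neither
    convention are labelled "other".
    """
    gaps = []
    for match in re.finditer(r"[Nn]+", sequence):
        length = match.end() - match.start()
        gaps.append((match.start(), length, GAP_LABELS.get(length, "other")))
    return gaps


def summarize_gaps(sequence):
    """Count gaps by label."""
    return Counter(label for _, _, label in find_gaps(sequence))


# Toy sequence for illustration only.
toy = "ACGT" + "N" * 100 + "GGCC" + "N" * 1000 + "TTAA" + "N" * 7 + "AC"
summary = summarize_gaps(toy)
```

For a real assembly the same functions could be applied to each chromosome sequence after reading the FASTA file; runs labelled "other" would indicate gaps that do not follow the stated convention.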



Zm-B73-REFERENCE-NAM-5.0/Zm00001eb.1 Information

In-depth metadata for Zm-B73-REFERENCE-NAM-5.0 is available here.
See the paper for B73 RefGen_v1 here, and for Zm-B73-REFERENCE-GRAMENE-4.0 here.

Counts for each chromosome.
Chromosome Accession Length Protein Coding Transposable Element
Chromosome 1 LR618874.1 308,452,471 5892 227,345
Chromosome 2 LR618875.1 243,675,191 4751 176,504
Chromosome 3 LR618876.1 238,017,767 4103 173,251
Chromosome 4 LR618877.1 250,330,460 4093 183,689
Chromosome 5 LR618878.1 226,353,449 4485 160,922
Chromosome 6 LR618879.1 181,357,234 3412 129,220
Chromosome 7 LR618880.1 185,808,916 3070 141,993
Chromosome 8 LR618881.1 182,411,202 3536 130,992
Chromosome 9 LR618882.1 163,004,744 2988 117,200
Chromosome 10 LR618883.1 152,435,371 2705 112,766
Unmapped 5892 23,216
Nuclear Total ~2,182,000 39,756 1,577,104
Annotations: Zm00001eb.1 NCBI 103


Zm-B73-REFERENCE-NAM-5.0/Zm00001eb.1 Stats


Gene Feature Value
Average protein-coding transcript size 5376 bp
Longest transcript: 745,091 bp (Zm00001eb334630_T004)
Average transposable element size 1638 bp
Average Exon size 290 bp
Average Number of exons per gene 6 exons
Maximum exons per gene 80 exons (Zm00001eb126710_T002)
Average Coding region size 1816 bp


Previous reference genome assemblies

(Zm-B73-REFERENCE-GRAMENE-4.0)


Zm-B73-REFERENCE-GRAMENE-4.0/Zm00001d.2 Information

In-depth metadata for Zm-B73-REFERENCE-GRAMENE-4.0 is available here.
See the paper for B73 RefGen_v1 here, and for Zm-B73-REFERENCE-GRAMENE-4.0 here.

Counts for each chromosome.
Chromosome Accession Protein Coding miRNA Transposable Element Low Confidence
Chromosome 1 NC_024459.2 5905 14 2209
Chromosome 2 NC_024460.2 4737 22 2209
Chromosome 3 NC_024461.2 4737 16 1571
Chromosome 4 NC_024462.2 4115 20 1826
Chromosome 5 NC_024463.2 4480 24 1681
Chromosome 6 NC_024464.2 3290 11 1223
Chromosome 7 NC_024465.2 3108 10 1193
Chromosome 8 NC_024466.2 3561 13 1288
Chromosome 9 NC_024467.2 2973 7 1191
Chromosome 10 NC_024468.2 2684 17 1034
Unmapped 319 0 357
Nuclear Total 39,324 154 15,516
Annotations: Zm00001d.2


Zm-B73-REFERENCE-GRAMENE-4.0/Zm00001d Stats


Gene Feature Value
Average protein-coding transcript size 7638 bp
Average low confidence transcript size 6981 bp
Average transposable element size unavailable
Average Exon size 156 bp
Average Number of exons per gene 4 exons
Maximum exons per gene 81 exons (Zm00001d040166)
Average Intron size 578 bp
Average Coding region size 207 bp

Assembly process: In-depth metadata for B73 RefGen_v3 is available here.
Detailed information about the V3 assembly process is available at .

B73 RefGen_v3 Information

Counts for each chromosome.
Chromosome Accession Protein Coding miRNA Transposable Element Low Confidence
Chromosome 1 NC_024459.1 6007 15 4296 6044
Chromosome 2 NC_024460.1 4742 23 3582 4997
Chromosome 3 NC_024461.1 4174 16 3093 4352
Chromosome 4 NC_024462.1 4182 21 3668 4688
Chromosome 5 NC_024463.1 4473 22 3103 4249
Chromosome 6 NC_024464.1 3278 10 2502 3430
Chromosome 7 NC_024465.1 3115 10 2424 3274
Chromosome 8 NC_024466.1 3505 13 2593 3508
Chromosome 9 NC_024467.1 2991 8 2404 3288
Chromosome 10 NC_024468.1 2688 18 2268 2734
Unmapped 146 0 51 59
Nuclear Total 39,475 156 29,996 40,680
Annotation: B73 RefGen_v3 gene model set 5b+


B73 RefGen_v3 Stats


Gene Feature Value
Average protein-coding transcript size 4255 bp
Average low confidence transcript size 959 bp
Average transposable element size 1694 bp
Average Exon size 287 bp
Average Number of exons per gene 3.6 exons
Maximum exons per gene 35 exons (GRMZM2G068755_T01)
Average Intron size 630 bp
Average Coding region size 213 bp

B73 RefGen_v2 Information

In-depth metadata for B73 RefGen_v2 is available here.

Counts for each chromosome.
Chromosome Working Gene set (WGS) WGS Transcript Filtered gene set (FGS) model FGS Transcript
Chromosome 1 16,344 20,556 6,056 9,899
Chromosome 2 13,284 16,387 4,766 7,485
Chromosome 3 11,613 14,383 4,197 6,650
Chromosome 4 12,517 15,463 4,197 6,822
Chromosome 5 11,828 14,920 4,503 7,319
Chromosome 6 9,207 11,458 3,293 5,263
Chromosome 7 8,813 10,965 3,147 5,081
Chromosome 8 9,633 12,085 3,531 5,695
Chromosome 9 8,347 10,313 2,920 4,690
Chromosome 10 7,718 9,463 2,727 4,274
Unmapped 157 170 52 62
Nuclear Total 109,461 136,163 39,389 63,240
Mitochondria 171 175 124 127
Chloroplast 72 73 57 58
Total 109,704 136,411 39,570 63,425
Annotation: B73 RefGen_v2: Release 5b.60

B73 RefGen_v2 Stats


Gene Feature Value
Average WGS transcript size 2646 bp
Average FGS transcript size 4237 bp
Average Exon size 287 bp
Average Number of exons per gene 3.6 exons
Maximum exons per gene 53 exons (GRMZM2G068755_T01)
Average Intron size 629 bp
Average Coding region size 210 bp
Average 5' UTR length 280 bp
Average 3' UTR length 336 bp

B73 RefGen_v1 Information

In-depth metadata for B73 RefGen_v1 is available here.



Change history

B73 RefGen_v1

First complete assembly of the B73 genome.

B73 RefGen_v2

Improvements to order and orientation of within-BAC contigs using the minimum tiling path (MTP). Improvements to gene models.

B73 RefGen_v3

Captured missing gene space using WGS reads. 213 new gene models were introduced, 251 gene models were improved, and 10 gene models were merged to create new models:
GRMZM2G000964, GRMZM2G103315 -> GRMZM2G000964
GRMZM2G045892, GRMZM2G452386 -> GRMZM2G045892
GRMZM2G119720, GRMZM2G518717 -> GRMZM2G119720
GRMZM2G142383, GRMZM2G020429 -> GRMZM2G142383
GRMZM2G319465, GRMZM2G439578 -> GRMZM2G319465
GRMZM2G338693, GRMZM2G117517 -> GRMZM2G338693
GRMZM5G861997, GRMZM5G864178 -> GRMZM5G861997
GRMZM5G872800, GRMZM2G143862 -> GRMZM5G872800
GRMZM5G891969, GRMZM5G823855 -> GRMZM5G891969
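
When working with older gene lists, the merges listed above can be applied as a simple lookup. This is a minimal sketch: the pairs are copied from the list, while the names `MERGES`, `build_lookup`, and `to_merged_id` are hypothetical. An ID that is absent from the list is returned unchanged, which does not by itself show that the ID is still valid in RefGen_v3.

```python
# Merged gene models copied from the list above: both IDs on the left
# were merged into the single ID on the right.
MERGES = [
    (("GRMZM2G000964", "GRMZM2G103315"), "GRMZM2G000964"),
    (("GRMZM2G045892", "GRMZM2G452386"), "GRMZM2G045892"),
    (("GRMZM2G119720", "GRMZM2G518717"), "GRMZM2G119720"),
    (("GRMZM2G142383", "GRMZM2G020429"), "GRMZM2G142383"),
    (("GRMZM2G319465", "GRMZM2G439578"), "GRMZM2G319465"),
    (("GRMZM2G338693", "GRMZM2G117517"), "GRMZM2G338693"),
    (("GRMZM5G861997", "GRMZM5G864178"), "GRMZM5G861997"),
    (("GRMZM5G872800", "GRMZM2G143862"), "GRMZM5G872800"),
    (("GRMZM5G891969", "GRMZM5G823855"), "GRMZM5G891969"),
]


def build_lookup(merges):
    """Map every source ID to the ID of the merged model."""
    lookup = {}
    for sources, target in merges:
        for source in sources:
            lookup[source] = target
    return lookup


MERGED_ID = build_lookup(MERGES)


def to_merged_id(gene_id):
    """Return the merged model ID, or the input unchanged if not listed."""
    return MERGED_ID.get(gene_id, gene_id)
```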

Zm-B73-REFERENCE-GRAMENE-4.0

A de novo assembly using PacBio technologies. New annotation analysis with gene models linked to v3 gene models.

Zm-B73-REFERENCE-NAM-5.0

De novo assembly using PacBio Sequel sequencing technology. Scaffolds validated by improved BioNano optical mapping. New annotation analysis.


B73 Reference Gene Models and Nomenclature


With increasing numbers of full reference genomes with structural annotation becoming available, it has become necessary to establish naming standards that span genomes and versions. The recommendation is available here.

Important note: The B73 v5 and NAM founders genome assemblies were released with a preliminary annotation, named "Zm00001e.1". The official annotation was sufficiently different from the preliminary annotation that it was given a new name, "Zm00001eb.1" and the gene models were given new identifiers. Unlike previous annotations, the new identifiers are numbered in sequential order.

Cross-reference files that translate between the preliminary and official annotations can be found in the MaizeGDB downloads. The cross-reference for Zm00001e.1 to Zm00001eb.1 is here.

The current reference gene model set is named Zm00001eb.1. Gene models within this set are prefixed with "Zm00001eb". Associations between the v4 gene models (Zm00001d.2) and the 5b+ gene models are available here.
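
Identifiers such as Zm00001eb126710_T002 and Zm00001d040166, both quoted on this page, can be split into their parts with a regular expression. The sketch below is illustrative only: the field names (`species`, `line`, `annotation`, `gene`, `isoform`) and the assumed widths (five digits for the line, one or two lowercase letters for the annotation, six digits for the gene) are inferred from the examples here and are assumptions; the nomenclature recommendation is the authoritative definition.

```python
import re

# Pattern inferred from IDs quoted on this page; field meanings are assumptions.
GENE_ID_PATTERN = re.compile(
    r"^(?P<species>[A-Z][a-z])"      # e.g. "Zm"
    r"(?P<line>\d{5})"               # e.g. "00001"
    r"(?P<annotation>[a-z]{1,2})"    # e.g. "d" or "eb"
    r"(?P<gene>\d{6})"               # e.g. "126710"
    r"(?:_(?P<isoform>[TP]\d{3}))?$" # optional transcript/protein suffix
)


def parse_gene_id(identifier):
    """Split an identifier into named parts, or return None if it does not match."""
    match = GENE_ID_PATTERN.match(identifier)
    return match.groupdict() if match else None


def annotation_prefix(identifier):
    """Return the set prefix such as "Zm00001eb", or None for other ID styles."""
    parts = parse_gene_id(identifier)
    if parts is None:
        return None
    return parts["species"] + parts["line"] + parts["annotation"]
```

Older GRMZM-style identifiers do not follow this pattern, so `parse_gene_id` returns None for them rather than guessing.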


Gene model sets (annotations) by reference assembly version:

gene model set | description | assembly version | Gramene version | cross reference
Zm00001eb.1 | Official v5 annotation | Zm-B73-REFERENCE-NAM-5.0 | 64 | xref
Zm00001e.1 | Preliminary and replaced by Zm00001eb.1 | Zm-B73-REFERENCE-NAM-5.0 | N/A |
Zm00001d.2 | Filtered Gene Set | Zm-B73-REFERENCE-GRAMENE-4.0 | 36 | xref
Zm00001d.1 | Filtered Gene Set | Zm-B73-REFERENCE-GRAMENE-4.0 | 32-33 | xref
5b+ | Filtered Gene Set, mostly projections of 5b | RefGen_v3 | 18-31 | xref
5a | Working Gene Set (WGS) | RefGen_v2 | 7-17 |
5b | Filtered Gene Set (FGS) - subset of WGS | RefGen_v2 | 7-17 |
4a.53 | Filtered and Working gene sets | RefGen_v1 | |

The Zm00001eb.1 gene model set is the recommended gene model set for Zm-B73-REFERENCE-NAM-5.0 and is the representative gene model set for maize.

For RefGen_v3, the 5b+ gene model set is recommended. Other gene model sets for RefGen_v3 are provided for comparison. Due to the difficulty of determining when two gene models are the same (or when one represents an alternative splicing of the same genomic material), there are no plans to merge the sets.


For more information see the Nomenclature Standards



Alternative annotations

Additional annotations for the B73 genome assemblies have been generated by groups outside the genome sequencing project. The outside annotations listed below are shown as tracks on the assembly browsers.
Assembly Name Source link
Zm-B73-REFERENCE-NAM-5.0 NCBI 103 NCBI https://www.ncbi.nlm.nih.gov/genome/annotation_euk/Zea_mays/103/
Zm-B73-REFERENCE-GRAMENE-4.0 NCBI 102 NCBI https://www.ncbi.nlm.nih.gov/genome/annotation_euk/Zea_mays/102/
B73 RefGen_v3 NCBI 100 NCBI https://www.ncbi.nlm.nih.gov/genome/annotation_euk/Zea_mays/100/
B73 RefGen_v3 EvidentialGene Don Gilbert, Indiana University http://arthropods.eugenes.org/EvidentialGene/plants/corn/evg5corn/

Description of Gramene/Ensembl versions of B73 genome download files

Versions supported by MaizeGDB are in bold.

No changes were made to unmasked or masked assembly downloads unless noted.


Ensembl Gramene Assembly Gene Model Set Date Changes
7-17 25-35 B73 RefGen_v2 5b 11/30/10 - 03/10/13 not calculated
18 36 B73 RefGen_v3 5b+ 04/29/13 Initial B73 v3 downloads
19 37 B73 RefGen_v3 5b+ 07/08/13 GFF downloads available
20 38 B73 RefGen_v3 5b+ 09/10/13 17625 additional gene models [more]
21 39 B73 RefGen_v3 5b+ 01/16/14 2830 gene models removed, additional repeated masked files [more]
22 40 B73 RefGen_v3 5b+ 04/09/14 70676 models removed (WGS) [more]
23 41 B73 RefGen_v3 5b+ 09/01/14 166 gene models removed [more]
24 42 B73 RefGen_v3 5b+ 11/24/14 None
25 43 B73 RefGen_v3 5b+ 02/03/15 None
26 44 B73 RefGen_v3 5b+ 04/06/15 No repeat masked sequence in downloads
27 45 B73 RefGen_v3 5b+ 06/18/15 Repeat masked sequence returned to downloads
28 46 B73 RefGen_v3 5b+ 08/18/15 None
29 47 B73 RefGen_v3 5b+ 10/27/15 None
30 48 B73 RefGen_v3 5b+ 01/14/16 None
31 49 B73 RefGen_v3 5b+ 07/23/17 Repeat-masked gene model downloads
32 50 Zm-B73-REFERENCE-GRAMENE-4.0 Zm00001d.1 08/04/16 Initial B73 v4 downloads
33 51 Zm-B73-REFERENCE-GRAMENE-4.0 07/14/17 174 additional gene models (organelle), changes in 5825 gene models [more], Mt and Pt sequence added to assembly downloads, new repeat masked assembly FASTA files (%Ns changed from 85.12% to 6.69%).
34 52 Zm-B73-REFERENCE-GRAMENE-4.0 12/18/16 Changes in 154 gene models [more], no changes to assembly FASTA files.
35 53 Zm-B73-REFERENCE-GRAMENE-4.0 04/26/17 Changes in 787 gene models [more], no changes to assembly FASTA files.
36 54 Zm-B73-REFERENCE-GRAMENE-4.0 Zm00001d.2 06/27/17 Changes in 154 gene models [more], no changes to assembly FASTA files.
37 55 Zm-B73-REFERENCE-GRAMENE-4.0 10/05/17 1861 additional gene models including Rfam predictions, changes to 154 gene models [more], no changes to assembly FASTA files
38 56 Zm-B73-REFERENCE-GRAMENE-4.0 02/07/18 70 gene models removed, 7 added, changes to 1945 gene models [more], new repeat masked assembly FASTA files (%Ns changed to 90.22)
39 57 Zm-B73-REFERENCE-GRAMENE-4.0 05/15/18 New repeat masked assembly FASTA files (%Ns changed to 90.01)
40 58 Zm-B73-REFERENCE-GRAMENE-4.0 05/15/18 New repeat masked assembly FASTA files (%Ns changed to 90.22)
41 59 Zm-B73-REFERENCE-GRAMENE-4.0 09/20/18 174 GRMZM ids for organelle gene models removed, 364 organelle gene models added.

B73 Reference Genome Assembly and Gene Model Issues

We need your help! Please report any assembly or gene model structure problems. This includes misassembled regions, evidence for closing gaps, gene models that should be merged or split, evidence supporting low-confidence gene models, et cetera. All issues will be shared with the maize community and with the team charged with improving the B73 assembly and gene models.


Please contact MaizeGDB for information about open gene model issues
All resolved gene model issues


All open assembly issues
All resolved assembly issues



B73 Stock Information

The seed source for both Zm-B73-REFERENCE-NAM-5.0 and Zm-B73-REFERENCE-GRAMENE-4.0 descended from PI 550473, but was maintained for several generations prior to being used as the source seed. The seeds closest to those used for sequencing v4 were deposited at the NCRPIS (accession number: PI 677128).

The B73 source for the BAC libraries (BACs with prefix "b" prepared in Rod Wing's lab; BACs with prefix "c" prepared in Peter deJong's lab) was PI 550473. When requesting seed from the North Central Regional Plant Introduction Station, ask for any lot descended from the Coe PI 550473 lines.

The stock was received directly by the North Central Regional Plant Introduction Station from Arnel Hallauer and has been maintained by the quality-maintenance procedures at the PI Station. Ed Coe reports that, "The results of QC lab checks for constancy in PI 550473 have been excellent."

The same source was used for the IBM mapping population. Maps produced at Missouri used 302 lines of this population, providing unmatched precision (resolution is at the intra-BAC level). These maps anchor the fingerprint-based contig assemblies to chromosome location.

High-Molecular-Weight DNA was prepared by Jack Gardiner in the lab at Missouri and shipped to Clemson (Wing's lab at the time) and to deJong's lab (just at the time his lab was moving to California) for BAC preparation.

NSF grant reports have documented the details, and specifics for the materials, preparation, characterization, and final assembly of the contig framework can be found in Coe E, Schaeffer ML (2005) Genetic, physical, maps, and database resources for maize. Maydica 50:285-303. Ed Coe has made a copy of that paper available here


Chromosome - Genbank accessions reference

Chromosome B73 RefGen_v1 B73 RefGen_v2 B73 RefGen_v3 B73 RefGen_v4 B73 RefGen_v5 Publication
Chromosome 1 GK000031.1 GK000031.2 GK000031.3 CM007647.1 LR618874.1 PubMed, MaizeGDB
Chromosome 2 GK000032.1 GK000032.2 GK000032.3 CM007648.1 LR618875.1 PubMed, MaizeGDB
Chromosome 3 GK000033.1 GK000033.2 GK000033.3 CM007649.1 LR618876.1 PubMed, MaizeGDB
Chromosome 4 CM000780.1 CM000780.2 CM000780.3 CM000780.4 LR618877.1 PubMed, MaizeGDB
Chromosome 5 CM000781.1 CM000781.2 CM000781.4 CM000781.4 LR618878.1 PubMed, MaizeGDB
Chromosome 6 CM000782.1 CM000782.2 CM000782.3 CM000782.4 LR618879.1 PubMed, MaizeGDB
Chromosome 7 GK000034.1 GK000034.2 GK000034.3 CM007650.1 LR618880.1 PubMed, MaizeGDB
Chromosome 8 CM000784.1 CM000784.2 CM000784.3 CM000784.4 LR618881.1 PubMed, MaizeGDB
Chromosome 9 CM000785.1 CM000785.2 CM000785.3 CM000785.4 LR618882.1 PubMed, MaizeGDB
Chromosome 10 CM000786.1 CM000786.2 CM000786.3 CM000786.4 LR618883.1 PubMed, MaizeGDB

WGS (Whole Genome Shotgun) records at GenBank:
Zm-B73-REFERENCE-GRAMENE-4.0
Zm-B73-REFERENCE-NAM-5.0

B73 Assembly and Gene Model Downloads

Gramene files currently hosted at MaizeGDB correspond to Gramene version 36. See the summary of Gramene versions above.

assembly dataset Gramene version(s)
Zm-B73-REFERENCE-NAM-5.0 assembly and annotations
Zm-B73-REFERENCE-GRAMENE-4.0 assembly 32-40
Zm-B73-REFERENCE-GRAMENE-4.0 gene model cDNA fasta 36
Zm-B73-REFERENCE-GRAMENE-4.0 gene model ncRNA fasta 36
Zm-B73-REFERENCE-GRAMENE-4.0 gene model CDS fasta 36
Zm-B73-REFERENCE-GRAMENE-4.0 gene model translations fasta 36
Zm-B73-REFERENCE-GRAMENE-4.0 gene model GFF3 36
B73 RefGen_v3 assembly 18-31
B73 RefGen_v3 gene model cDNA fasta 18-31
B73 RefGen_v3 gene model ncRNA fasta 18-31
B73 RefGen_v3 gene model translations fasta 18-31
B73 RefGen_v3 gene model GFF3 18-31
B73 RefGen_v3 MAKER-P gene models n/a
B73 RefGen_v2 assembly 7-17
B73 RefGen_v2 filtered gene models 7-17
B73 RefGen_v2 working gene set 7-17
B73 RefGen_v2 functional annotation from Gramene n/a
B73 RefGen_v1 assembly n/a
B73 RefGen_v1 filtered gene models n/a
B73 RefGen_v1 working gene set n/a


V4 Functional annotation from Phytozome 10 (log in required)


Cross reference for GRMZM and ZEAMMB73 IDs


Publications

Hufford et al., 2021. De novo assembly, annotation, and comparative analysis of 26 diverse maize genomes. (Preprint)

Jiao et al., 2017. Improved maize reference genome with single-molecule technologies.

Law et al., 2015. Automated update, revision, and quality control of the maize genome annotations using MAKER-P improves the B73 RefGen_v3 gene models and identifies new genes.

Wei et al., 2009. The physical and genetic framework of the maize B73 genome.

Schnable et al., 2009. The B73 maize genome: complexity, diversity, and dynamics.

Wei et al., 2007. Physical and Genetic Structure of the Maize Genome Reflects Its Complex Evolutionary History.

Bi et al., 2006. Single Nucleotide Polymorphisms and Insertion–Deletions for Genetic Markers and Anchoring the Maize Fingerprint Contig Physical Map.

Gardiner et al., 2004. Anchoring 9,371 maize expressed sequence tagged unigenes to the bacterial artificial chromosome contig map by two-dimensional overgo hybridization.

Coe et al., 2002. Access to the Maize Genome: An Integrated Physical and Genetic Map.

Yim et al., 2002. Characterization of Three Maize Bacterial Artificial Chromosome Libraries toward Anchoring of the Physical Map to the Genetic Map Using High-Density Bacterial Artificial Chromosome Filter Hybridization.

FAQs

What is a Reference Genome?
What is a Representative Genome?
What are the main changes between the v4 and v5 assemblies?
What are the main changes between the v3 and v4 assemblies?
What are the main changes between the v2 and v3 assemblies?
Why was a preliminary v5 annotation released?
How can I map positions between the v2 and v3 assemblies?
Where can I find legacy resources from MaizeSequence.Org?
How can I identify the Filtered Gene Set (FGS) in RefGen_v3?
Where can I download a GFF dump of the FGS for maize genes in v3 (5b+)?


What is a Reference Genome?

A Reference Genome is a haploid representation of a genome as DNA sequence with a defined coordinate system, and accession and version identification. A Reference genome is usually assembled de novo, rather than relying on related genomes for assembly of small DNA fragments (which would be a reference guided assembly). A Reference Genome usually includes the structural annotations, or gene models, derived from the sequence assembly. A Reference Genome is almost always a work in progress that gets better with the additional new data over time. Data for improvement is collected continually, and at certain times, new Reference Genome versions come out that incorporate this data. B73 RefGen_v3 is such an updated version.


What is a Representative Genome?

A Representative Genome is a reference-quality genome which is considered to be representative for a species. B73 is the representative maize genome.


What are the main changes between RefGen_v4 (Zm-B73-REFERENCE-GRAMENE-4.0) and Zm-B73-REFERENCE-NAM-5.0?

Zm-B73-REFERENCE-NAM-5.0 is a de novo assembly using improved PacBio long-read technology and BioNano optical maps, using the same tissue source as RefGen_v4 (Zm-B73-REFERENCE-GRAMENE-4.0).


What are the main changes between RefGen_v3 and RefGen_v4 (Zm-B73-REFERENCE-GRAMENE-4.0)?

Zm-B73-REFERENCE-GRAMENE-4.0 was a complete de novo assembly using PacBio technology on DNA extracted from a descendant of the accession used for the v1 - v3 assemblies.


What are the main changes between RefGen_v2 and RefGen_v3?

Changes to the assembly include:

  • v3 captured missing gene space in v2 using WGS reads (v2 improved initial BAC assembly using MTP)
  • Several contigs were moved or flipped.


Why was a preliminary v5 annotation released?

A preliminary annotation, Zm00001e.1, was released alongside the v5 genome assembly to put tools into the hands of researchers as soon as possible, but with warnings not to rely on any specific gene models until the formal annotation, Zm00001eb.1, was released.


How can I map positions between the v4 and v5 assemblies?

There is no converter yet available for translating between v4 and v5 positions, but chain files are available here, which can be used with LiftOver or CrossMap to convert sets of coordinates. Be aware that features on the unplaced scaffolds in the v4 assembly will not be correctly translated to the v5 assembly.
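
To show what a chain file encodes, here is a minimal sketch of a single-position lookup. It assumes the UCSC chain layout (a `chain` header line followed by `size dt dq` block lines, ending with a lone `size`); the helper names `parse_chains` and `lift_position` are hypothetical and are not part of LiftOver or CrossMap, and the toy chain is invented rather than taken from the v4-to-v5 files. Unlike the real tools, it returns the first aligned block that contains the position instead of choosing the best-scoring chain.

```python
def parse_chains(lines):
    """Parse chain-format lines into a list of (header, blocks).

    Each block is (old_start, new_start, size) in 0-based coordinates.
    """
    chains = []
    blocks, old_pos, new_pos = None, 0, 0
    for raw in lines:
        fields = raw.split()
        if not fields:
            continue
        if fields[0] == "chain":
            header = {
                "old_name": fields[2], "old_strand": fields[4],
                "new_name": fields[7], "new_size": int(fields[8]),
                "new_strand": fields[9],
            }
            old_pos, new_pos = int(fields[5]), int(fields[10])
            blocks = []
            chains.append((header, blocks))
        else:
            size = int(fields[0])
            blocks.append((old_pos, new_pos, size))
            old_pos += size
            new_pos += size
            if len(fields) == 3:  # gaps before the next aligned block
                old_pos += int(fields[1])
                new_pos += int(fields[2])
    return chains


def lift_position(chains, chrom, pos):
    """Map a 0-based position; return (chrom, pos, strand) or None if unaligned."""
    for header, blocks in chains:
        if header["old_name"] != chrom or header["old_strand"] != "+":
            continue
        for old_start, new_start, size in blocks:
            if old_start <= pos < old_start + size:
                new_pos = new_start + (pos - old_start)
                if header["new_strand"] == "-":
                    # Minus-strand coordinates count from the other end.
                    new_pos = header["new_size"] - 1 - new_pos
                return header["new_name"], new_pos, header["new_strand"]
    return None


# Toy chain data for illustration only.
TOY_CHAIN = [
    "chain 1000 chr1 1000 + 100 400 chr1 1200 + 150 460 1",
    "100 50 60",
    "150",
    "",
    "chain 500 chr2 500 + 0 100 chr2 500 - 0 100 2",
    "100",
]
toy_chains = parse_chains(TOY_CHAIN)
```

A position that falls in a gap between aligned blocks returns None, which is the same situation in which a conversion tool would report the coordinate as unmapped. For whole feature sets, LiftOver or CrossMap with the provided chain files remains the practical route.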


How can I map positions between the v2 and v3 assemblies?

Use the Ensembl assembly converter tool at Gramene.


Where can I find legacy resources from MaizeSequence.Org?

At the Gramene ftp archive.


How can I identify the Filtered Gene Set (FGS) in RefGen_v3?

In the 5b+ gene build, the former FGS gene models are indicated as protein-coding.
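
Selecting the protein-coding models from a GFF3 file can be sketched as follows. This assumes that gene features carry a `biotype=protein_coding` attribute in column 9, as in Ensembl-style GFF3 files; that attribute name is an assumption about the 5b+ download and should be checked against the actual file. The helper names and the toy records are hypothetical.

```python
def parse_attributes(column9):
    """Parse a GFF3 attribute column ("key=value;key=value") into a dict."""
    attributes = {}
    for pair in column9.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attributes[key.strip()] = value.strip()
    return attributes


def protein_coding_gene_ids(gff_lines):
    """Return IDs of gene features whose biotype is protein_coding."""
    ids = []
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip directives, comments, and blank lines
        columns = line.rstrip("\n").split("\t")
        if len(columns) != 9 or columns[2] != "gene":
            continue
        attributes = parse_attributes(columns[8])
        if attributes.get("biotype") == "protein_coding":
            ids.append(attributes.get("ID", ""))
    return ids


# Toy records for illustration only; IDs and coordinates are invented.
TOY_GFF = [
    "##gff-version 3",
    "1\ttoy\tgene\t100\t900\t.\t+\t.\tID=gene:toy1;biotype=protein_coding",
    "1\ttoy\tmRNA\t100\t900\t.\t+\t.\tID=transcript:toy1_T01;Parent=gene:toy1",
    "1\ttoy\tgene\t2000\t2500\t.\t-\t.\tID=gene:toy2;biotype=transposable_element",
]
```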


Where can I download a GFF dump of the FGS for maize genes in v3 (5b+)?

From the Gramene 5b+ ftp folder.



Coming soon to MaizeGDB

Updated September 28th, 2021

Tracks

  • Ab initio gene models for v5 and NAM assemblies - completed
  • TE annotations - completed
  • Structural Variations - completed (based on B73v5 coordinates)
  • Align SNPs from MaizeSNP50 chip - completed
  • Core Bin Markers - available on browsers, but not yet elsewhere
  • HapMap5 → replaced with WGS mapping, in progress
  • B73v5 methylome - completed
  • HiC data

Page updates

  • Core Bin Marker Pages
  • Additional functional annotations for NAM gene models - completed
  • JBrowse snapshots within gene model pages - completed
  • Pan-gene information on gene pages - completed
  • Add physical coordinates for each NAM line to NAM genetic map page

Tools

  • MaizeMine update to include v5 annotations
  • qTeller update to include NAM RNA-seq data - completed

Features

  • Pan-genome and syntenic ortholog information - completed
  • External links to Apollo instances for manual structural annotations
  • V5 gene models linked to gene data - completed
  • Current v5 gene models are evidence-based. Ab initio gene models will be added when available. - completed
  • Create JBrowse instances for non-NAM assemblies - done for W22 and Mo17 CAU
  • Deploy JBrowse plugins to combine bigWig tracks - completed

Project Details Metadata Browser

Information about assembly B73 RefGen_v3 (also known as AGPv3)

Assembly identifier: Zm00001c

Click here to learn about maize genome and gene model nomenclature rules.
This assembly has been replaced with Zm-B73-REFERENCE-GRAMENE-4.0.

Genome Sequencing Project Information

Project name Version 3 of the maize B73 reference genome
GenBank BioProject PRJNA72137
Project PI Doreen Ware
Project start date 2013-01-01
Release date 2013
Changes to previous version 500 fosmid clones were sequenced and finished and an 8x shotgun sequence was generated via 454 sequencing. BAC assemblies were improved for BACs in the tiling path, and gaps in v2 were filled with resultant sequence contigs.
Funding NSF: Improving the Sequence of the Maize Genome
Project reference A panoply of genomics techniques to update the Zea mays B73 reference sequence and annotations. Andrew J. Olson, Joshua C. Stein, Shiran Pasternak, Jeffrey C. Glaubitz, Edward S. Buckler, Fusheng Wei, Jianwei Zhang, Rod A. Wing, Robert S. Fulton, Richard K. Wilson, Ethalinda K.S. Cannon, Carson M. Andorf, Carolyn J. Lawrence

Stock and Biosample Information

Stock information
Stock name Coe PI 550473
Stock record 47638
Stock details Coe PI 550473
Biosample information
Species Zea mays ssp. mays
Sample name Coe PI 550473
Sample description The source for the inbred line B73 used to make the BAC libraries that were sequenced is available from the North Central Regional Plant Introduction Station through the U.S. National Plant Germplasm System under the accession PI 550473. When requesting seed from the North Central Regional Plant Introduction Station, ask for any lot descended from the Coe PI 550473 lines.
GenBank BioSample SAMN02981394
Location Ames, Iowa, USA

Sequencing and Assembly Information

Assembly name B73 RefGen_v3
Assembly date 2013
Assembly accession GCF_000005005.1
WGS accession AHID00000000
Contributors Andrew J. Olson, Joshua C. Stein, Shiran Pasternak, Jeffrey C. Glaubitz, Edward S. Buckler, Fusheng Wei, Jianwei Zhang, Rod A. Wing, Robert S. Fulton, Richard K. Wilson, Ethalinda K.S. Cannon, Carson M. Andorf, Carolyn J. Lawrence
Assembly provider Ware lab
Sequencing description Sequencing technologies: Sanger and 454
Genome coverage: 6x
Assembly description Assembly methods: phredPhrap v. 2009 and ABySS v. 1.2.7 and
Construction of pseudomolecules: Map-based order and orientation of a BAC tiling path, with some gaps filled with 454 contigs.
Browse Genome Genome browser at MaizeGDB
Data download ftp://ftp.ensemblgenomes.org/pub/plants/release-22/fasta/zea_mays/dna
https://download.maizegdb.org/B73_RefGen_v3/
ftp://ftp.ncbi.nlm.nih.gov/genomes/genbank/plant/Zea_mays/all_assembly_versions/GCA_000005005.5_B73_RefGen_v3
Release date 2013
Finishing strategy Complete genome
Seq service provider Roche
Assembly statistics
Scaff num 523
N50 scaff length 8,225,948 bp
N50 scaff count 79
N90 scaff length 595,319 bp
N90 scaff count 366
N50 contig length 13,961 bp
N50 contig count 41,305
Scaff num: Total number of scaffolds in the assembly.
N50 scaff length: The length of the scaffold that takes the cumulative length (summing from longest to shortest scaffold) past 50% of the total assembly size.
N50 scaff count: How many scaffolds are counted in reaching the N50 threshold.
N90 scaff length: The length of the scaffold that takes the cumulative length (summing from longest to shortest scaffold) past 90% of the total assembly size.
N90 scaff count: How many scaffolds are counted in reaching the N90 threshold.
N50 contig length: The length of the contig that takes the cumulative length (summing from longest to shortest contig) past 50% of the total assembly size.
N50 contig count: How many contigs are counted in reaching the N50 threshold.
A contig is a contiguous consensus sequence that is derived from a collection of overlapping reads.
A scaffold is a set of ordered and oriented contigs that are linked to one another by mate pairs of sequencing reads.
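
The N50 and N90 definitions above translate directly into a short calculation. The sketch below uses a hypothetical function name, `nx_stats`, and toy lengths rather than the B73 RefGen_v3 scaffolds. It treats reaching the threshold exactly as passing it, which is one common reading of the definition.

```python
def nx_stats(lengths, fraction=0.5):
    """Return (Nx length, Nx count) for a list of sequence lengths.

    Lengths are summed from longest to shortest; the result is the length
    of the sequence at which the running sum reaches `fraction` of the
    total, and how many sequences were counted to get there.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    threshold = sum(ordered) * fraction
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= threshold:
            return length, count
    return ordered[-1], len(ordered)  # fallback for floating-point rounding


# Toy scaffold lengths for illustration only (total = 300).
toy_lengths = [40, 100, 20, 80, 60]
n50_length, n50_count = nx_stats(toy_lengths, 0.5)
n90_length, n90_count = nx_stats(toy_lengths, 0.9)
```

With the toy lengths, the running sums are 100, 180, 240, 280, and 300, so the 50% threshold (150) is passed at the second scaffold and the 90% threshold (270) at the fourth.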

Annotation

Annotation Identifier 5b+
Annotation Provider Ware lab
Annotation Date 2013
Is current yes
Annotation Software Gramene evidence-based gene build pipeline and FGENESH
Data download https://download.maizegdb.org/B73_RefGen_v3/